Goto

Collaborating Authors

 Haikou


Spiking Neural Networks: The Future of Brain-Inspired Computing

Aribe, Sales G. Jr

arXiv.org Artificial Intelligence

Spiking Neural Networks (SNNs) represent the latest generation of neural computation, offering a brain-inspired alternative to conventional Artificial Neural Networks (ANNs). Unlike ANNs, which depend on continuous-valued signals, SNNs operate using distinct spike events, making them inherently more energy-efficient and temporally dynamic. This study presents a comprehensive analysis of SNN design models, training algorithms, and multi-dimensional performance metrics, including accuracy, energy consumption, latency, spike count, and convergence behavior. Key neuron models such as the Leaky Integrate-and-Fire (LIF) and training strategies, including surrogate gradient descent, ANN-to-SNN conversion, and Spike-Timing Dependent Plasticity (STDP), are examined in depth. Results show that surrogate gradient-trained SNNs closely approximate ANN accuracy (within 1-2%), with faster convergence by the 20th epoch and latency as low as 10 milliseconds. Converted SNNs also achieve competitive performance but require higher spike counts and longer simulation windows. STDP-based SNNs, though slower to converge, exhibit the lowest spike counts and energy consumption (as low as 5 millijoules per inference), making them optimal for unsupervised and low-power tasks. These findings reinforce the suitability of SNNs for energy-constrained, latency-sensitive, and adaptive applications such as robotics, neuromorphic vision, and edge AI systems. While promising, challenges persist in hardware standardization and scalable training. This study concludes that SNNs, with further refinement, are poised to propel the next phase of neuromorphic computing.


TripTide: A Benchmark for Adaptive Travel Planning under Disruptions

Karmakar, Priyanshu, Chaudhuri, Soumyabrata, Mallick, Shubhojit, Gupta, Manish, Jana, Abhik, Ghosh, Shreya

arXiv.org Artificial Intelligence

Recent efforts like TripCraft and TravelPlanner have advanced the use of Large Language Models ( LLMs) for personalized, constraint aware travel itinerary generation. Yet, real travel often faces disruptions. To address this, we present TripTide, the first benchmark evaluating LLM's ability to revise itineraries under realistic disruptions. TripTide models key dimensions such as disruption severity and traveler tolerance, enabling nuanced assessment of LLM adaptability to events like flight cancellations, weather closures, or overbooked attractions. We conduct a threefold evaluation. First, we introduce automatic metrics including Preservation of Intent (how well the revised plan maintains feasibility and goals), Responsiveness (promptness and appropriateness of disruption handling), and Adaptability (semantic, spatial, and sequential divergence between original and revised plans). Second, we apply an LLM-as-a-judge approach to automatically assess revision quality. Third, we perform manual expert evaluation to verify whether revisions preserve semantic, spatial, sequential, and responsive aspects. Our experiments show that LLMs maintain strong sequential consistency and semantic stability, while spatial deviations are larger for shorter trips but decrease with longer ones, indicating that extended plans encourage better geographic coherence. However, disruption-handling ability declines as plan length increases, highlighting limits in LLM robustness. TripTide establishes a benchmark for evaluating adaptability, personalization, and resilience in LLM-based travel planning under real-world uncertainty.


A Novel GPT-Based Framework for Anomaly Detection in System Logs

Zhang, Zeng, Yin, Wenjie, Li, Xiaoqi

arXiv.org Artificial Intelligence

Identification of anomalous events within system logs constitutes a pivotal element within the frame- work of cybersecurity defense strategies. However, this process faces numerous challenges, including the management of substantial data volumes, the distribution of anomalies, and the precision of con- ventional methods. To address this issue, the present paper puts forward a proposal for an intelligent detection method for system logs based on Genera- tive Pre-trained Transformers (GPT). The efficacy of this approach is attributable to a combination of structured input design and a Focal Loss op- timization strategy, which collectively result in a substantial enhancement of the performance of log anomaly detection. The initial approach involves the conversion of raw logs into event ID sequences through the use of the Drain parser. Subsequently, the Focal Loss loss function is employed to address the issue of class imbalance. The experimental re- sults demonstrate that the optimized GPT-2 model significantly outperforms the unoptimized model in a range of key metrics, including precision, recall, and F1 score. In specific tasks, comparable or superior performance has been demonstrated to that of the GPT-3.5 API.


GeoSketch: A Neural-Symbolic Approach to Geometric Multimodal Reasoning with Auxiliary Line Construction and Affine Transformation

Weng, Shichao, Wang, Zhiqiang, Zhou, Yuhua, Lu, Rui, Liu, Ting, Teng, Zhiyang, Liu, Xiaozhang, Liu, Hanmeng

arXiv.org Artificial Intelligence

Geometric Problem Solving (GPS) poses a unique challenge for Multimodal Large Language Models (MLLMs), requiring not only the joint interpretation of text and diagrams but also iterative visuospatial reasoning. While existing approaches process diagrams as static images, they lack the capacity for dynamic manipulation--a core aspect of human geometric reasoning involving auxiliary line construction and affine transformations. GeoSketch integrates: (1) a Perception module that abstracts diagrams into structured logic forms, (2) a Symbolic Reasoning module that applies geometric theorems to decide the next deductive step, and (3) a Sketch Action module that executes operations such as drawing auxiliary lines or applying transformations, thereby updating the diagram in a closed loop. To train this agent, we develop a two-stage pipeline: supervised fine-tuning on 2,000 symbolic-curated trajectories followed by reinforcement learning with dense, symbolic rewards to enhance robustness and strategic exploration. To evaluate this paradigm, we introduce the GeoSketch Benchmark, a high-quality set of 390 geometry problems requiring auxiliary construction or affine transformations. Experiments on strong MLLM baselines demonstrate that GeoSketch significantly improves stepwise reasoning accuracy and problem-solving success over static perception methods. Work done during an internship at Hainan University. With the advent of Multimodal Large Language Models (MLLMs) (OpenAI, 2024; Comanici et al., 2025; Hong et al., 2025), Geometric Problem Solving (GPS) presents a unique challenge to MLLMs, demanding not only the joint understanding of text and diagrams but also rigorous, multi-step deductive reasoning (Zhang et al., 2023; Qiao et al., 2024; He et al., 2025). While modern MLLMs can ingest multimodal inputs, their reasoning output remains confined to static text. This limits the use of dynamic and visuospatial strategies in geometric problem solving, which becomes particularly evident in complex tasks requiring multi-stage manipulation.


SPORTSQL: An Interactive System for Real-Time Sports Reasoning and Visualization

Martinez, Sebastian, Ahuja, Naman, Bardoliya, Fenil, Bryan, Chris, Gupta, Vivek

arXiv.org Artificial Intelligence

We present a modular, interactive system, SPORTSQL, for natural language querying and visualization of dynamic sports data, with a focus on the English Premier League (EPL). The system translates user questions into executable SQL over a live, temporally indexed database constructed from real-time Fantasy Premier League (FPL) data. It supports both tabular and visual outputs, leveraging the symbolic reasoning capabilities of Large Language Models (LLMs) for query parsing, schema linking, and visualization selection. To evaluate system performance, we introduce the Dynamic Sport Question Answering benchmark (DSQABENCH), comprising 1,700+ queries annotated with SQL programs, gold answers, and database snapshots. Our demo highlights how non-expert users can seamlessly explore evolving sports statistics through a natural, conversational interface.


ChineseHarm-Bench: A Chinese Harmful Content Detection Benchmark

Liu, Kangwei, Cheng, Siyuan, Tian, Bozhong, Liang, Xiaozhuan, Yin, Yuyang, Han, Meng, Zhang, Ningyu, Hooi, Bryan, Chen, Xi, Deng, Shumin

arXiv.org Artificial Intelligence

Large language models (LLMs) have been increasingly applied to automated harmful content detection tasks, assisting moderators in identifying policy violations and improving the overall efficiency and accuracy of content review. However, existing resources for harmful content detection are predominantly focused on English, with Chinese datasets remaining scarce and often limited in scope. We present a comprehensive, professionally annotated benchmark for Chinese content harm detection, which covers six representative categories and is constructed entirely from real-world data. Our annotation process further yields a knowledge rule base that provides explicit expert knowledge to assist LLMs in Chinese harmful content detection. In addition, we propose a knowledge-augmented baseline that integrates both human-annotated knowledge rules and implicit knowledge from large language models, enabling smaller models to achieve performance comparable to state-of-the-art LLMs. Code and data are available at https://github.com/zjunlp/ChineseHarm-bench.


Decoding the Multimodal Maze: A Systematic Review on the Adoption of Explainability in Multimodal Attention-based Models

Kibria, Md Raisul, Lafond, Sébastien, Arslan, Janan

arXiv.org Artificial Intelligence

Multimodal learning has witnessed remarkable advancements in recent years, particularly with the integration of attention-based models, leading to significant performance gains across a variety of tasks. Parallel to this progress, the demand for explainable artificial intelligence (XAI) has spurred a growing body of research aimed at interpreting the complex decision-making processes of these models. This systematic literature review analyzes research published between January 2020 and early 2024 that focuses on the explainability of multimodal models. Framed within the broader goals of XAI, we examine the literature across multiple dimensions, including model architecture, modalities involved, explanation algorithms and evaluation methodologies. Our analysis reveals that the majority of studies are concentrated on vision-language and language-only models, with attention-based techniques being the most commonly employed for explanation. However, these methods often fall short in capturing the full spectrum of interactions between modalities, a challenge further compounded by the architectural heterogeneity across domains. Importantly, we find that evaluation methods for XAI in multimodal settings are largely non-systematic, lacking consistency, robustness, and consideration for modality-specific cognitive and contextual factors. Based on these findings, we provide a comprehensive set of recommendations aimed at promoting rigorous, transparent, and standardized evaluation and reporting practices in multimodal XAI research. Our goal is to support future research in more interpretable, accountable, and responsible mulitmodal AI systems, with explainability at their core.


Divide-Then-Rule: A Cluster-Driven Hierarchical Interpolator for Attribute-Missing Graphs

Hu, Yaowen, Tu, Wenxuan, Liu, Yue, Li, Miaomiao, Lu, Wenpeng, Luo, Zhigang, Liu, Xinwang, Chen, Ping

arXiv.org Artificial Intelligence

Deep graph clustering (DGC) for attribute-missing graphs is an unsupervised task aimed at partitioning nodes with incomplete attributes into distinct clusters. Addressing this challenging issue is vital for practical applications. However, research in this area remains underexplored. Existing imputation methods for attribute-missing graphs often fail to account for the varying amounts of information available across node neighborhoods, leading to unreliable results, especially for nodes with insufficient known neighborhood. To address this issue, we propose a novel method named Divide-Then-Rule Graph Completion (DTRGC). This method first addresses nodes with sufficient known neighborhood information and treats the imputed results as new knowledge to iteratively impute more challenging nodes, while leveraging clustering information to correct imputation errors. Specifically, Dynamic Cluster-Aware Feature Propagation (DCFP) initializes missing node attributes by adjusting propagation weights based on the clustering structure. Subsequently, Hierarchical Neighborhood-aware Imputation (HNAI) categorizes attribute-missing nodes into three groups based on the completeness of their neighborhood attributes. The imputation is performed hierarchically, prioritizing the groups with nodes that have the most available neighborhood information. The cluster structure is then used to refine the imputation and correct potential errors. Finally, Hop-wise Representation Enhancement (HRE) integrates information across multiple hops, thereby enriching the expressiveness of node representations. Experimental results on six widely used graph datasets show that DTRGC significantly improves the clustering performance of various DGC methods under attribute-missing graphs.


Information Security Based on LLM Approaches: A Review

Gong, Chang, Li, Zhongwen, Li, Xiaoqi

arXiv.org Artificial Intelligence

Information security is facing increasingly severe challenges, and traditional protection means are difficult to cope with complex and changing threats. In recent years, as an emerging intelligent technology, large language models (LLMs) have shown a broad application prospect in the field of information security. In this paper, we focus on the key role of LLM in information security, systematically review its application progress in malicious behavior prediction, network threat analysis, system vulnerability detection, malicious code identification, and cryptographic algorithm optimization, and explore its potential in enhancing security protection performance. Based on neural networks and Transformer architecture, this paper analyzes the technical basis of large language models and their advantages in natural language processing tasks. It is shown that the introduction of large language modeling helps to improve the detection accuracy and reduce the false alarm rate of security systems. Finally, this paper summarizes the current application results and points out that it still faces challenges in model transparency, interpretability, and scene adaptability, among other issues. It is necessary to explore further the optimization of the model structure and the improvement of the generalization ability to realize a more intelligent and accurate information security protection system.


Overview of the NLPCC 2025 Shared Task: Gender Bias Mitigation Challenge

Li, Yizhi, Zhang, Ge, Hong, Hanhua, Wang, Yiwen, Lin, Chenghua

arXiv.org Artificial Intelligence

As natural language processing for gender bias becomes a significant interdisciplinary topic, the prevalent data-driven techniques, such as pre-trained language models, suffer from biased corpus. This case becomes more obvious regarding those languages with less fairness-related computational linguistic resources, such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation (CORGI-PM), which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. It is worth noting that CORGI-PM contains 5.2k gender-biased sentences along with the corresponding bias-eliminated versions rewritten by human annotators. We pose three challenges as a shared task to automate the mitigation of textual gender bias, which requires the models to detect, classify, and mitigate textual gender bias. In the literature, we present the results and analysis for the teams participating this shared task in NLPCC 2025.